Single Channel Speech Music Separation Using Nonnegative Matrix Factorization with Sliding Windows and Spectral Masks

Authors

  • Emad M. Grais
  • Hakan Erdogan
Abstract

A single channel speech-music separation algorithm based on nonnegative matrix factorization (NMF) with sliding windows and spectral masks is proposed in this work. We train a set of basis vectors for each source signal using NMF in the magnitude spectral domain. Rather than forming each column of the matrices to be decomposed by NMF from a single spectral frame, we build each column by stacking multiple spectral frames. After observing the mixed signal, NMF is used to decompose its magnitude spectra into a weighted linear combination of the trained basis vectors for both sources. An initial spectrogram estimate for each source is found, and a spectral mask is built using these initial estimates. This mask is used to weight the mixed signal spectrogram to find the contribution of each source signal to the mixed signal. The method is shown to perform better than the conventional NMF approach.
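
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`stack_frames`, `unstack_frames`, `nmf`, `separate`), the KL-divergence multiplicative updates, the one-frame window hop, the averaging of overlapping frames, and the mask exponent `p` are all assumptions made for illustration. The inputs are assumed to be magnitude spectrograms of shape (frequency bins, frames).

```python
import numpy as np

def stack_frames(V, width):
    """Stack `width` consecutive frames into each column (window hop of one frame, assumed)."""
    n_frames = V.shape[1]
    cols = [V[:, t:t + width].reshape(-1, order="F")
            for t in range(n_frames - width + 1)]
    return np.stack(cols, axis=1)

def unstack_frames(S, n_bins, width):
    """Undo stack_frames by averaging the overlapping copies of each frame (assumed choice)."""
    n_cols = S.shape[1]
    n_frames = n_cols + width - 1
    out = np.zeros((n_bins, n_frames))
    count = np.zeros(n_frames)
    for k in range(width):
        # Block k of column j holds an estimate of frame j + k.
        out[:, k:k + n_cols] += S[k * n_bins:(k + 1) * n_bins, :]
        count[k:k + n_cols] += 1
    return out / count

def nmf(V, rank, n_iter=200, B=None, seed=0, eps=1e-12):
    """NMF with KL-divergence multiplicative updates (assumed cost function).

    If a basis matrix B is supplied, it is kept fixed and only the gains G are updated.
    """
    rng = np.random.default_rng(seed)
    fixed = B is not None
    if not fixed:
        B = rng.random((V.shape[0], rank)) + eps
    G = rng.random((B.shape[1], V.shape[1])) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        G *= (B.T @ (V / (B @ G + eps))) / (B.T @ ones + eps)
        if not fixed:
            B *= ((V / (B @ G + eps)) @ G.T) / (ones @ G.T + eps)
    return B, G

def separate(V_mix, B_speech, B_music, width, p=2, eps=1e-12):
    """Decompose the mixture on the trained bases, then apply a spectral mask."""
    n_bins = V_mix.shape[0]
    X = stack_frames(V_mix, width)
    B = np.hstack([B_speech, B_music])           # trained bases for both sources
    _, G = nmf(X, B.shape[1], B=B)               # bases fixed, gains estimated
    r = B_speech.shape[1]
    S_hat = unstack_frames(B_speech @ G[:r], n_bins, width)  # initial speech estimate
    M_hat = unstack_frames(B_music @ G[r:], n_bins, width)   # initial music estimate
    mask = S_hat**p / (S_hat**p + M_hat**p + eps)            # soft mask in [0, 1)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this sketch, training amounts to calling `nmf(stack_frames(V_train, width), rank)` once per source and keeping the returned basis matrix. Because the two masks sum to one, the two estimates returned by `separate` add up to the mixture spectrogram exactly.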


Similar resources

Block Nonnegative Matrix Factorization for Single Channel Source Separation

Nonnegative Matrix Factorization (NMF) [1, 2] has been widely used in audio research, e.g. automatic music transcription [3], musical source separation [4], and speech enhancement [5]. The key strategy for applying NMF to audio-related tasks is to find a lower rank representation of the Short Time Fourier Transformed (STFT) input signal and use the basis vectors as dictionaries. For example, in...


Single-Channel Mixture Decomposition Using Bayesian Harmonic Models

We consider the source separation problem for single-channel music signals. After a brief review of existing methods, we focus on decomposing a mixture into components made of harmonic sinusoidal partials. We address this problem in the Bayesian framework by building a probabilistic model of the mixture combining generic priors for harmonicity, spectral envelope, note duration and continuity. E...


Bayesian group sparse learning for music source separation

Nonnegative matrix factorization (NMF) is developed for parts-based representation of nonnegative signals with the sparseness constraint. The signals are adequately represented by a set of basis vectors and the corresponding weight parameters. NMF has been successfully applied for blind source separation and many other signal processing systems. Typically, controlling the degree of sparseness a...


Bayesian factorization and selection for speech and music separation

This paper proposes a new Bayesian nonnegative matrix factorization (NMF) for speech and music separation. We introduce the Poisson likelihood for NMF approximation and the exponential prior distributions for the factorized basis matrix and weight matrix. A variational Bayesian (VB) EM algorithm is developed to implement an efficient solution to variational parameters and model parameters for B...


Adaptation of Speaker-Specific Bases in Non-Negative Matrix Factorization for Single Channel Speech-Music Separation

This paper introduces a speaker adaptation algorithm for nonnegative matrix factorization (NMF) models. The proposed adaptation algorithm is a combination of Bayesian and subspace model adaptation. The adapted model is used to separate speech signal from a background music signal in a single record. Training speech data for multiple speakers is used with NMF to train a set of basis vectors as a...




Publication date: 2011